MS&E 336 Lecture 14: Approachability and regret minimization
Abstract
… $\prod_{j \neq i} A_j$. We let $a_i$ denote a pure action for player $i$, and let $s_i \in \Delta(A_i)$ denote a mixed action for player $i$. We will typically view $s_i$ as a vector in $\mathbb{R}^{A_i}$, with $s_i(a_i)$ equal to the probability that player $i$ places on $a_i$. We let $\Pi_i(a)$ denote the payoff to player $i$ when the composite pure action vector is $a$, and by an abuse of notation also let $\Pi_i(s)$ denote the expected payoff to player $i$ when the composite mixed action vector is $s$. The game is played repeatedly by the players. We let $h^T = (a^1, \ldots, a^T)$ denote the history up to time $T$. The external regret of player $i$ against action $s_i$ after history $h^T$ is:
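The snippet is cut off at the displayed definition. A common way to write external regret in this notation (the symbol $R^{\mathrm{ext}}_i$ and the time-averaging convention are illustrative assumptions, not taken from the snippet) is
\[
R^{\mathrm{ext}}_i(h^T; s_i) \;=\; \frac{1}{T} \sum_{t=1}^{T} \Big[ \Pi_i\big(s_i, a^t_{-i}\big) - \Pi_i\big(a^t\big) \Big],
\]
i.e., the average gain player $i$ would have obtained by playing the fixed mixed action $s_i$ in every round against the opponents' realized actions $a^t_{-i}$.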
Similar Articles
MS&E 336 Lecture 15: Calibration
Calibration is a concept that tries to formalize a notion of quality for forecasters. For example, suppose a weatherman predicts each day whether it will rain or be sunny. Typically forecasters will predict such events in terms of probabilities, i.e., “There is a 30% chance of rain.” Given only the outcome that day, it is impossible to judge the quality of such a forecast. However, if we c...
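To make the truncated discussion concrete, here is one common formalization (an illustrative convention, not necessarily the one used in the full lecture): suppose the forecaster is restricted to announcing probabilities from a finite set $P \subset [0,1]$; let $n_T(p)$ be the number of days up to time $T$ on which $p$ was announced, and let $\bar{X}_T(p)$ be the empirical frequency of rain on those days. The forecasts are calibrated if
\[
\sum_{p \in P} \frac{n_T(p)}{T}\, \big| \bar{X}_T(p) - p \big| \;\longrightarrow\; 0 \quad \text{as } T \to \infty,
\]
so that, for instance, it rains on roughly 30% of the days on which a 30% chance of rain was forecast.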
Robust approachability and regret minimization in games with partial monitoring
Approachability has become a standard tool in analyzing learning algorithms in the adversarial online learning setup. We develop a variant of approachability for games in which the obtained reward is ambiguous: it belongs to a set rather than being a single vector. Using this variant we tackle the problem of approachability in games with partial monitoring and develop simple and efficie...
Response-Based Approachability and its Application to Generalized No-Regret Algorithms
Approachability theory, introduced by Blackwell (1956), provides fundamental results on repeated games with vector-valued payoffs, and has since been usefully applied in the theory of learning in games and to learning algorithms in the online adversarial setup. Given a repeated game with vector payoffs, a target set S is approachable by a certain player (the agent) if he can ensure that the ave...
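For concreteness, the usual formal definition runs roughly as follows (an illustrative statement with assumed notation, not a quotation from the paper): if the agent's vector payoff at round $t$ is $m(a^t, b^t) \in \mathbb{R}^d$, where $a^t$ is the agent's action and $b^t$ the opponent's, then a closed target set $S \subseteq \mathbb{R}^d$ is approachable by the agent if he has a strategy under which
\[
d\!\left( \frac{1}{T} \sum_{t=1}^{T} m(a^t, b^t),\; S \right) \;\longrightarrow\; 0 \quad \text{almost surely as } T \to \infty,
\]
uniformly over the opponent's strategies, where $d(\cdot, S)$ denotes Euclidean distance to $S$.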
Zero-Sum Games with Vector-Valued Payoffs
In this lecture we formulate and prove the celebrated approachability theorem of Blackwell, which extends von Neumann's minimax theorem to zero-sum games with vector-valued payoffs [1]. (The proof here is based on the presentation in [2]; a similar presentation was given by Foster and Vohra [3].) This theorem is powerful in its own right, but also has significant implications for regret minimiz...
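For reference, the core statement for closed convex target sets is usually given along the following lines (the notation $m(\cdot,\cdot)$, $\Delta(A_1)$, $\Delta(A_2)$ is assumed here for illustration): a closed convex set $S \subseteq \mathbb{R}^d$ is approachable by player 1 if and only if
\[
\forall\, q \in \Delta(A_2) \;\; \exists\, p \in \Delta(A_1): \qquad m(p, q) \;=\; \sum_{a_1 \in A_1} \sum_{a_2 \in A_2} p(a_1)\, q(a_2)\, m(a_1, a_2) \;\in\; S,
\]
i.e., player 1 can respond to any mixed action of player 2 with an expected vector payoff already inside $S$.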
Blackwell Approachability and No-Regret Learning are Equivalent
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger: Blackwell’s result is equivalent, in a very strong sense, to the problem of regret minimizati...